1 Introduction

Advanced aerospace electronics systems require high-speed, low-power, radiation-hard, 
digital components for signal processing, control, and communication applications. GaAs 
VLSI devices provide a number of advantages over silicon devices including higher carrier 
velocities, ability to integrate with high performance optical devices, and high-resistivity
substrates that provide very short gate delays, good isolation, and tolerance to many 
forms of radiation. However, III-V technologies also have disadvantages, such as lower 
yield compared to silicon MOS technology. 

Achieving very large scale integration (VLSI) is particularly important for fast complex 
systems. At very short gate delays (less than 100 ps), chip-to-chip interconnects severely 
degrade circuit clock rates. Complex systems, therefore, benefit greatly when as many 
gates as possible are placed on a single chip. To fully exploit the advantages of GaAs 
circuits, attention must be focused on achieving high integration levels by reducing power 
dissipation, reducing the number of devices per logic function, and providing circuit designs 
that are more tolerant to process and environmental variations. In addition, adequate noise 
margin must be maintained to ensure a practical yield. 

2 Applications 

Specific applications of GaAs ICs are in fiber optic communications and digital signal 
processing. The use of fiber optics on board aircraft and spacecraft provides significant
reductions in weight. GaAs electronics have achieved fiber optic data rates well beyond 
1 Gigabit per second. GaAs circuits can also benefit aerospace high-speed data
processing by occupying a smaller volume and reducing power dissipation, thus
saving weight. Although ECL technology can come close to the speed of GaAs, its
power dissipation is much higher. CMOS technology can be used in some applications 
by processing data in parallel at the expense of larger volume. Some applications require 
low latency and must be performed at a high data rate, thus eliminating parallel solutions
entirely.
3 Floating Point Multiplier

We designed a 32-bit floating point multiplier to investigate the yield and performance of 
GaAs VLSI for applications in digital signal processing. With over 10,000 equivalent gates, 
the multiplier approaches the current complexity limits of GaAs. It also provides a good 
example of a GaAs VLSI integrated circuit targeted for aerospace applications. 

The multiplier accepts normalized 32-bit floating point numbers expressed in the IEEE
Standard 754, version 8.0 or 10.0 single precision format[1]. GaAs 1 micron E/D MESFET
technology was chosen because of the maturity of the fabrication process for LSI
production. Operation over the full military temperature range is required.

4 GaAs Logic Family Considerations 

There are several logic families that are commonly used to design GaAs E/D MESFET 
circuits. In choosing a logic family, we were most concerned about noise margin. At these 
high integration levels, noise margin must be higher due to increased device variations, 
power-bus noise and crosstalk on signal lines. This high noise margin must be sufficient 
over the entire military temperature range to ensure adequate yield. We also wanted
single-supply operation.

Families that meet the above criteria are source-coupled FET logic (SCFL)[2], Gain
FET logic (GFL)[3], and FET-FET logic (FFL)[4]. FFL was invented at Boeing and has a
better delay-power product than either GFL or SCFL. Although SCFL has a much higher
power dissipation, it can perform very complex logic functions, including implementing
a full adder with two gates and providing the sum and carry outputs in only one gate delay
each.


Adder Type    Device   Sum Delay   Carry Delay   Adder Power   Wallace Tree   Wallace Tree Delay-
              Count    (ps)        (ps)          (mW)          Delay (ns)     Power Product (pJ)

NOR FFL       91       480         290           9.6           1.9            18.2
Complex FFL   34       585         282           2.1           2.0            4.2
SCFL          44       460         450           2.4           1.8            4.3
Complex GFL   39       750         395           1.8           2.6            4.7


Table 1: Comparison of Different Adder Designs 


5 Full Adder Design 

The most important building block of the multiplier is the full adder. The design of the full
adder is a determining factor in the final speed and power dissipation of the chip. Figure
1 shows the schematic of an all-NOR implementation of a full adder. It requires 12 gates
to implement, and the carry and sum are generated in 2 and 3 gate delays, respectively.
This full adder design results in a high device count and high power.

Figure 1: NOR Implementation of Full Adder
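The gate and delay counts quoted above can be checked with a small logic-level sketch. The network below is only one NOR-only arrangement that happens to use 12 gates with the carry available after 2 gate levels and the sum after 3; it is an assumption for illustration and may differ from the schematic in Figure 1. The function names are hypothetical.

```python
from itertools import product

def nor(*inputs):
    """N-input NOR gate on 0/1 values (a single input acts as an inverter)."""
    return 0 if any(inputs) else 1

def nor_full_adder(a, b, c):
    """Return (sum, carry) from an assumed 12-gate, NOR-only network."""
    # Level 1: three inverters plus four gates driven by the true inputs.
    na, nb, nc = nor(a), nor(b), nor(c)            # gates 1-3
    ab, bc, ac = nor(a, b), nor(b, c), nor(a, c)   # gates 4-6
    none_set = nor(a, b, c)                        # gate 7: high only for 000
    # Level 2: carry = (a+b)(b+c)(a+c), plus one gate per "two ones" pattern.
    carry = nor(ab, bc, ac)                        # gate 8
    t110 = nor(na, nb, c)                          # gate 9: high only for 110
    t101 = nor(na, b, nc)                          # gate 10: high only for 101
    t011 = nor(a, nb, nc)                          # gate 11: high only for 011
    # Level 3: the sum is low exactly for inputs 000, 110, 101 and 011.
    total = nor(none_set, t110, t101, t011)        # gate 12
    return total, carry

# Exhaustive check against ordinary binary addition.
for a, b, c in product((0, 1), repeat=3):
    s, cy = nor_full_adder(a, b, c)
    assert 2 * cy + s == a + b + c
```

The three inverters and the wide fan-in gates are what drive up the device count relative to a complex-gate design.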

A complex AND/NOR gate full adder was designed using FFL and GFL gates (Figure 2).
This full adder requires only 2 complex gates, and the carry and sum are generated in 1
and 2 gate delays, respectively. As a comparison, an SCFL full adder was designed (Figure 3).
The SCFL adder was designed to have comparable power dissipation to the complex FFL
and GFL adders.

It takes 3 sum delays and 1 carry delay for the Wallace adder tree to reduce 13 partial
products to 3. Table 1 shows the device count, the sum and carry delays, the power dissipation,
and the Wallace tree delay for the different adder designs under nominal processing
conditions.

Figure 2: Complex AND/NOR Full Adder

Figure 3: SCFL Adder
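The number of adder levels needed for this reduction can be sketched as follows, assuming each level groups rows in threes through full adders (3:2 carry-save compression) and passes leftover rows through unchanged. The function name is hypothetical, and the actual tree wiring is not given in the text.

```python
def csa_reduction_schedule(rows, target):
    """Row counts after each level of 3:2 carry-save reduction."""
    schedule = [rows]
    while rows > target:
        groups, leftover = divmod(rows, 3)
        rows = 2 * groups + leftover   # every group of 3 rows becomes 2 rows
        schedule.append(rows)
    return schedule

print(csa_reduction_schedule(13, 3))   # [13, 9, 6, 4, 3]
```

Under this assumption, reducing 13 rows to 3 takes four full-adder levels, which is consistent with a critical path of three sum delays plus one carry delay.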

The all-NOR implementation has a much higher device count and power than the other
designs.

We chose FFL with the complex gate full adder approach to implement the multiplier.
FFL has the lowest delay-power product for the Wallace tree and the smallest device 
count. GFL has 15% lower performance than FFL with comparable layout area. SCFL is 
comparable in delay-power performance to FFL but requires a substantially larger layout 
area. The area is larger mainly because SCFL is a differential logic family and requires 
two interconnects between gates instead of one. 
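The comparison can be reproduced from the Table 1 figures with a short calculation. This sketch assumes the Wallace tree delay is approximated by three sum delays plus one carry delay, and that the delay-power product is the adder power multiplied by the tabulated tree delay; both are inferences from the table, not formulas stated by the authors. The names are hypothetical.

```python
# Per design: (sum delay ps, carry delay ps, adder power mW, tabulated tree delay ns)
TABLE_1 = {
    "NOR FFL":     (480, 290, 9.6, 1.9),
    "Complex FFL": (585, 282, 2.1, 2.0),
    "SCFL":        (460, 450, 2.4, 1.8),
    "Complex GFL": (750, 395, 1.8, 2.6),
}

def tree_delay_estimate_ns(sum_ps, carry_ps):
    """Assumed critical path: three sum delays plus one carry delay."""
    return (3 * sum_ps + carry_ps) / 1000.0

def delay_power_product(power_mw, tree_delay_ns):
    """Adder power times tree delay; mW x ns gives pJ."""
    return power_mw * tree_delay_ns

for name, (s, c, p, t) in TABLE_1.items():
    est = tree_delay_estimate_ns(s, c)
    dpp = delay_power_product(p, t)
    print(f"{name:12s} estimated delay {est:.2f} ns, product {dpp:.1f} pJ")
```

The product column matches Table 1 to one decimal place. The delay estimate lands within about 0.05 ns of the tabulated values for the complex FFL, SCFL and complex GFL designs, while the NOR FFL estimate comes out lower than the tabulated 1.9 ns, so the simple path model likely omits something for that design.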

6 Multiplier Architecture 

Figure 4 shows a simplified block diagram of the floating point multiplier. The chip has 
a 4-stage pipeline architecture employing high-speed pass-transistor pipeline latches. The 
32-bit inputs are screened for invalid inputs and the signs of the two operands are combined by
an exclusive-OR gate. The exponent adder performs addition of the two 8-bit exponents
and outputs the sum, as well as the sum incremented by 1 for the possible right shift of 
the 24-bit mantissa result. The modified Booth Encoder produces a 69-bit code from the 
multiplicand. 
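As an illustration of the encoding step, the sketch below applies textbook radix-4 modified Booth recoding to an unsigned 24-bit mantissa, which yields 13 signed digits and therefore 13 partial products. The text does not describe the encoder's internal signal format, so the digit representation, the choice of recoded operand, and the function names here are assumptions.

```python
def booth_radix4_digits(x, nbits=24):
    """Recode an unsigned, even-width integer into radix-4 digits in -2..2."""
    assert nbits % 2 == 0 and 0 <= x < (1 << nbits)
    ndigits = nbits // 2 + 1          # 13 digits for a 24-bit operand
    padded = x << 1                   # append the implicit bit b[-1] = 0
    digits = []
    for i in range(ndigits):
        window = (padded >> (2 * i)) & 0b111   # bits b[2i+1], b[2i], b[2i-1]
        b_hi = (window >> 2) & 1
        b_mid = (window >> 1) & 1
        b_lo = window & 1
        digits.append(b_lo + b_mid - 2 * b_hi)
    return digits

def booth_multiply(a, b, nbits=24):
    """Sum the partial products digit * a * 4**i selected by the recoding of b."""
    digits = booth_radix4_digits(b, nbits)
    partials = [(d * a) << (2 * i) for i, d in enumerate(digits)]
    return sum(partials)

a, b = 0xABCDEF, 0xFFFFFF
assert len(booth_radix4_digits(b)) == 13
assert booth_multiply(a, b) == a * b
```

Each digit selects 0, plus or minus the other operand, or plus or minus twice that operand, so a partial product needs two bits more than the 24-bit mantissa (one for the doubling, one for the sign).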

Thirteen 26-bit partial products are generated by the partial product generator and
are reduced to three partial products by the first Wallace Tree. The second Wallace
Tree further reduces the three partial products to two, and the look-ahead-carry generator
generates the carries for the final adder. Two rounding modes are available: round to the
nearest and round toward zero. The result from the final adder is rounded, checked for
overflow and underflow, and renormalized into the 23-bit mantissa product. The correct
8-bit exponent result is then chosen and the 32-bit (sign bit, 8-bit exponent, and 23-bit
mantissa) product is obtained.

Figure 4: Block Diagram of Floating Point Multiplier
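The dataflow described in this section can be summarized with a behavioral sketch. It assumes normalized single-precision inputs, treats "round to the nearest" as round to nearest with ties to even, and replaces the Booth/Wallace hardware with a plain integer multiply; special values and underflow handling are reduced to a single range check. All function names are hypothetical.

```python
import struct

def to_bits(value):
    """Encode a Python float as a 32-bit single-precision pattern."""
    return struct.unpack(">I", struct.pack(">f", value))[0]

def from_bits(bits):
    """Decode a 32-bit single-precision pattern to a Python float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def fp_multiply(x_bits, y_bits, nearest=True):
    sx, ex, fx = x_bits >> 31, (x_bits >> 23) & 0xFF, x_bits & 0x7FFFFF
    sy, ey, fy = y_bits >> 31, (y_bits >> 23) & 0xFF, y_bits & 0x7FFFFF
    assert 0 < ex < 255 and 0 < ey < 255, "normalized inputs only"
    sign = sx ^ sy                                 # exclusive-OR of the signs
    exp = ex + ey - 127                            # biased exponent sum
    prod = (fx | (1 << 23)) * (fy | (1 << 23))     # 24 x 24 -> 48-bit product
    shift = 24 if (prod >> 47) else 23             # right shift if product >= 2
    exp += shift - 23                              # selects sum or sum + 1
    mant = prod >> shift                           # 24 bits incl. hidden one
    rem = prod & ((1 << shift) - 1)                # discarded low-order bits
    half = 1 << (shift - 1)
    if nearest and (rem > half or (rem == half and (mant & 1))):
        mant += 1
        if mant >> 24:                             # rounding carried out
            mant >>= 1
            exp += 1
    if not 0 < exp < 255:
        raise OverflowError("exponent overflow or underflow")
    return (sign << 31) | (exp << 23) | (mant & 0x7FFFFF)

x, y = to_bits(1.1), to_bits(1.3)
print(from_bits(fp_multiply(x, y)), from_bits(fp_multiply(x, y, nearest=False)))
```

With `nearest=False` the discarded bits are simply dropped, which corresponds to rounding toward zero.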

7 Simulated Results 

Automatic placement and routing of FFL standard cells was used to lay out the circuit.
Interconnect capacitances were then extracted. The critical paths were resimulated and 
found to be less than 3 ns between latches. Operation near 350 MFLOPS is expected 
for TriQuint Semiconductor’s 1 micron E/D MESFET process. Power dissipation will be 
under 4.5 W. The die size is about 7.5 mm by 8 mm with about 40,000 devices. 
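These figures can be related with simple arithmetic. The sketch below assumes one multiplication completes per clock cycle when the pipeline is full and uses the quoted limits (3 ns, 4.5 W, 350 MFLOPS) as inputs, so the results are bounds rather than measured values. The function names are hypothetical.

```python
def max_clock_mhz(critical_path_ns):
    """Clock rate permitted by a given latch-to-latch delay."""
    return 1000.0 / critical_path_ns

def energy_per_op_nj(power_w, rate_mflops):
    """Energy per multiplication: W / (1e6 op/s) is microjoules; scale to nJ."""
    return power_w / rate_mflops * 1000.0

print(f"{max_clock_mhz(3.0):.0f} MHz at a 3 ns critical path")
print(f"{energy_per_op_nj(4.5, 350):.1f} nJ per operation at 4.5 W and 350 MFLOPS")
```

A critical path just under 3 ns allows a clock somewhat above 333 MHz, which is in line with the expected rate near 350 MFLOPS.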

8 Conclusion 

The design of a GaAs VLSI floating point multiplier was described. The chip is expected 
to perform multiplication at data throughput rates of about 350 MHz when the pipeline 
latches are enabled. With the pipeline latches disabled, the multiplier will operate at about 
110 MHz. 

The high speed, low power, and radiation hardness of the multiplier will demonstrate
the benefits of using GaAs VLSI for aerospace electronics.